To Ann-Sofie

Author

  • Nikos Nikoleris
Abstract

To reduce latency and increase bandwidth to memory, modern microprocessors are designed with deep memory hierarchies including several levels of caches. For such microprocessors, the service time for fetching data from off-chip memory is about two orders of magnitude longer than for fetching data from the level-one cache. Consequently, the performance of applications is largely determined by how well they utilize the caches in the memory hierarchy, which is captured by their miss ratio curves. However, efficiently obtaining an application’s miss ratio curve and interpreting its performance implications is hard. This task becomes even more challenging when analyzing application performance on multi-core processors, where several applications or threads share caches and memory bandwidth. To accomplish this, we need powerful techniques that capture applications’ cache utilization and provide intuitive performance metrics. In this thesis we present three techniques for analyzing application performance: StatStack, StatCC and Cache Pirating. Our main focus is on providing memory hierarchy related performance metrics such as miss ratio, fetch ratio and bandwidth demand, but also execution rate. These techniques are based on profiling information, requiring both runtime data collection and post-processing. For such techniques to be broadly applicable, the data collection has to have minimal impact on the profiled application, allow profiling of unmodified binaries, and not depend on custom hardware or operating system extensions. Furthermore, the information provided has to be accurate and easy to interpret by programmers, the runtime environment and compilers. StatStack estimates an application’s miss ratio curve, StatCC estimates the miss ratios of co-running applications sharing the last-level cache, and Cache Pirating measures any desired performance metric available through hardware performance counters as a function of cache size.
We have experimentally shown that our methods are both efficient and accurate. The runtime information required by StatStack and StatCC can be collected with an average runtime overhead of 40%. The Cache Pirating method measures the desired performance metrics with an average runtime overhead of 5%.
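To make the notion of a miss ratio curve concrete, the following is a minimal sketch that computes one exactly for a small address trace, assuming a fully associative LRU cache. It is not the StatStack method: StatStack estimates the curve statistically from sparsely sampled profiling data, whereas this sketch walks the full trace. The function names `stack_distances` and `miss_ratio_curve` and the toy trace are illustrative assumptions, not taken from the thesis.

```python
from collections import OrderedDict


def stack_distances(trace):
    """Return the LRU stack distance of every access in `trace`.

    The stack distance is the number of distinct cache lines touched
    since the previous access to the same line. Cold (first-time)
    accesses get None.
    """
    stack = OrderedDict()  # least recently used first, most recent last
    distances = []
    for line in trace:
        if line in stack:
            keys = list(stack.keys())
            # Lines that sit after `line` in the stack were touched more recently.
            distances.append(len(keys) - 1 - keys.index(line))
            stack.move_to_end(line)
        else:
            distances.append(None)
            stack[line] = True
    return distances


def miss_ratio_curve(trace, cache_sizes):
    """Map each cache size (in lines) to the miss ratio of `trace`.

    Assumes a fully associative LRU cache: an access misses if it is
    cold or if its stack distance is at least the cache size.
    """
    dists = stack_distances(trace)
    total = len(dists)
    curve = {}
    for size in cache_sizes:
        misses = sum(1 for d in dists if d is None or d >= size)
        curve[size] = misses / total
    return curve


if __name__ == "__main__":
    # Hypothetical trace of cache-line identifiers: a loop over three lines.
    toy_trace = ["A", "B", "C", "A", "B", "C"]
    for size, ratio in miss_ratio_curve(toy_trace, [1, 2, 3]).items():
        print(f"cache size {size} lines: miss ratio {ratio:.2f}")
```

On the toy trace, every access misses until the cache can hold all three lines, at which point only the three cold misses remain. This quadratic-time walk is only meant to illustrate the metric; it says nothing about the overheads reported above.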


Similar resources

Keratin 8 modulates b-cell stress responses and normoglycaemia

Catharina M. Alam, Jonas S. G. Silvander, Ebot N. Daniel, Guo-Zhong Tao, Sofie M. Kvarnström, Parvez Alam, M. Bishr Omary, Arno Hänninen and Diana M. Toivola* Department of Biosciences, Cell Biology, Åbo Akademi University, Tykistökatu 6A, FIN-20520 Turku, Finland Department of Surgery, Stanford University School of Medicine, Stanford, California, USA Centre for Functional Materials, Åbo Akadem...



Publication date: 2011